
    Towards Building Deep Networks with Bayesian Factor Graphs

    We propose a Multi-Layer Network based on the Bayesian framework of Factor Graphs in Reduced Normal Form (FGrn) applied to a two-dimensional lattice. The Latent Variable Model (LVM) is the basic building block of a quadtree hierarchy built on top of a bottom layer of random variables that represent the pixels of an image, a feature map, or more generally a collection of spatially distributed discrete variables. The multi-layer architecture implements a hierarchical data representation that, via belief propagation, can be used for learning and inference. Typical uses are pattern completion, correction and classification. The FGrn paradigm provides great flexibility and modularity and appears to be a promising candidate for building deep networks: the system can be easily extended by introducing new and different (in cardinality and in type) variables. Prior knowledge, or supervised information, can be introduced at different scales. The FGrn paradigm provides a handy way of building all kinds of architectures by interconnecting only three types of units: Single Input Single Output (SISO) blocks, Sources and Replicators. The network is designed like a circuit diagram and the belief messages flow bidirectionally through the whole system. The learning algorithms operate only locally within each block. The framework is demonstrated in this paper in a three-layer structure applied to images extracted from a standard data set.
    Comment: Submitted for journal publication
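    As a minimal sketch of the sum-product message passing that the SISO blocks above perform, the update for a single block can be written as a matrix-vector product followed by normalization. The 2x3 conditional probability matrix below is purely illustrative, not taken from the paper.

    ```python
    # Sum-product message passing through one SISO block, i.e. a conditional
    # probability matrix P[x][y] = P(Y=y | X=x). Messages are proportional
    # distributions, so each one is renormalized after the update.

    def normalize(v):
        """Scale a message so it sums to one (only proportionality matters)."""
        s = sum(v)
        return [x / s for x in v]

    def forward_message(P, msg_in):
        """Propagate a belief message from input X to output Y: b(y) = sum_x m(x) P(y|x)."""
        return normalize([sum(msg_in[i] * P[i][j] for i in range(len(P)))
                          for j in range(len(P[0]))])

    def backward_message(P, msg_out):
        """Propagate a message from output Y back to input X: b(x) = sum_y P(y|x) m(y)."""
        return normalize([sum(P[i][j] * msg_out[j] for j in range(len(P[0])))
                          for i in range(len(P))])

    P = [[0.9, 0.05, 0.05],   # illustrative P(Y|X) for a 2-state input, 3-state output
         [0.1, 0.45, 0.45]]
    f = forward_message(P, [0.5, 0.5])        # uniform belief on X
    b = backward_message(P, [1.0, 0.0, 0.0])  # evidence that Y is the first symbol
    ```

    Because the update is purely local, blocks of this kind can be wired into arbitrary architectures, which is the flexibility the abstract refers to.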

    Considerations about learning Word2Vec

    Despite the large diffusion and use of embeddings generated through Word2Vec, there are still many open questions about the reasons for its results and about its real capabilities. In particular, to our knowledge, no author seems to have analysed in detail how learning may be affected by the various choices of hyperparameters. In this work, we try to shed some light on various issues focusing on a typical dataset. It is shown that the learning rate prevents the exact mapping of the co-occurrence matrix, that Word2Vec is unable to learn syntactic relationships, and that it does not suffer from the problem of overfitting. Furthermore, through the creation of an ad-hoc network, it is also shown how it is possible to improve Word2Vec directly on the analogies, obtaining very high accuracy without damaging the pre-existing embedding. This analogy-enhanced Word2Vec may be convenient in various NLP scenarios, but it is used here as an optimal starting point to evaluate the limits of Word2Vec.
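    The training rule whose hyperparameters (learning rate, negatives, epochs) the paper studies is skip-gram with negative sampling (SGNS). A hedged sketch of one SGNS gradient step follows; vector size, vocabulary and learning rate are toy values, not the paper's settings.

    ```python
    import math, random

    # One skip-gram negative-sampling update: push the (center, context) pair
    # score up and the (center, negative) scores down, following the gradient
    # of  log s(u.v) + sum_neg log s(-u.v_neg).

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def sgns_step(w_in, w_out, center, context, negatives, lr=0.025):
        """Update the input vector of `center` and the output vectors of `context`/`negatives`."""
        u = w_in[center]
        grad_u = [0.0] * len(u)
        for word, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
            v = w_out[word]
            g = lr * (label - sigmoid(dot(u, v)))   # scalar prediction error
            for k in range(len(u)):
                grad_u[k] += g * v[k]
                v[k] += g * u[k]                    # output vector updated in place
        for k in range(len(u)):
            u[k] += grad_u[k]

    random.seed(0)
    dim, vocab = 8, 5
    w_in = [[random.uniform(-0.5, 0.5) / dim for _ in range(dim)] for _ in range(vocab)]
    w_out = [[0.0] * dim for _ in range(vocab)]
    before = dot(w_in[0], w_out[1])
    for _ in range(100):
        sgns_step(w_in, w_out, center=0, context=1, negatives=[2, 3])
    after = dot(w_in[0], w_out[1])   # score of the true pair grows with training
    ```

    The learning rate `lr` appears directly in the error term, which makes the paper's observation plausible that it bounds how exactly the co-occurrence statistics can be mapped.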

    Optimized Realization of Bayesian Networks in Reduced Normal Form using Latent Variable Model

    Bayesian networks in their Factor Graph Reduced Normal Form (FGrn) are a powerful paradigm for implementing inference graphs. Unfortunately, the computational and memory costs of these networks may be considerable, even for relatively small networks, and this is one of the main reasons why these structures have often been underused in practice. In this work, through a detailed algorithmic and structural analysis, various solutions for cost reduction are proposed. An online version of the classic batch learning algorithm is also analyzed, showing very similar results (in an unsupervised context), which is essential if multilevel structures are to be built. The solutions proposed, together with the possible online learning algorithm, are included in a C++ library that is quite efficient, especially if compared to the direct use of the well-known sum-product and Maximum Likelihood (ML) algorithms. The results are discussed with particular reference to a Latent Variable Model (LVM) structure.
    Comment: 20 pages, 8 figures
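    An online counterpart of batch ML learning for a discrete conditional probability table can be sketched as a running-count scheme. This is an assumption-laden illustration (the forgetting factor `gamma` and Laplace-style initialization are not from the paper), not the library's actual algorithm.

    ```python
    # Online ML estimation of P(Y|X) from a stream of (x, y) observations.
    # gamma = 1.0 gives plain running counts (equivalent to batch ML in the
    # limit); gamma < 1.0 exponentially forgets old data.

    class OnlineCPT:
        def __init__(self, n_in, n_out, gamma=1.0):
            self.counts = [[1.0] * n_out for _ in range(n_in)]  # Laplace-style init
            self.gamma = gamma

        def update(self, x, y, weight=1.0):
            """Accumulate (possibly soft) evidence for the pair (X=x, Y=y)."""
            for row in self.counts:
                for j in range(len(row)):
                    row[j] *= self.gamma
            self.counts[x][y] += weight

        def prob(self, x, y):
            """Current ML estimate of P(Y=y | X=x)."""
            row = self.counts[x]
            return row[y] / sum(row)

    cpt = OnlineCPT(n_in=2, n_out=2)
    for _ in range(50):
        cpt.update(0, 1)   # a stream in which X=0 almost always yields Y=1
    ```

    Because each table updates from local counts only, the same scheme extends to every SISO block of a multilevel structure, which is why an online variant matters for deep hierarchies.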

    Intent Classification in Question-Answering Using LSTM Architectures

    Question-answering (QA) is certainly the best known and probably also one of the most complex problems within Natural Language Processing (NLP) and artificial intelligence (AI). Since the complete solution to the problem of finding a generic answer still seems far away, the wisest thing to do is to break the problem down into simpler parts and solve them individually. Assuming a modular approach to the problem, we confine our research to intent classification for an answer, given a question. Through the use of an LSTM network, we show how this type of classification can be approached effectively and efficiently, and how it can be properly used within a basic prototype responder.
    Comment: Presented at the 2019 Italian Workshop on Neural Networks (WIRN'19) - June 2019
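    The recurrent unit underlying the classifier is the standard LSTM cell. Below is an illustrative forward pass for a scalar-state cell with toy weights; an actual intent classifier would run this over the token sequence of a question and feed the final hidden state to a softmax over intent labels.

    ```python
    import math

    # One LSTM time step: input (i), forget (f) and output (o) gates plus the
    # candidate update (g). The cell state c carries long-range memory; the
    # hidden state h is what the next layer sees.

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def lstm_step(x, h_prev, c_prev, W):
        """One step for a scalar input/state; W maps gate name -> (w_x, w_h, bias)."""
        i = sigmoid(W["i"][0] * x + W["i"][1] * h_prev + W["i"][2])
        f = sigmoid(W["f"][0] * x + W["f"][1] * h_prev + W["f"][2])
        o = sigmoid(W["o"][0] * x + W["o"][1] * h_prev + W["o"][2])
        g = math.tanh(W["g"][0] * x + W["g"][1] * h_prev + W["g"][2])
        c = f * c_prev + i * g          # gated memory update
        h = o * math.tanh(c)            # exposed hidden state
        return h, c

    W = {k: (0.5, 0.5, 0.0) for k in ("i", "f", "o", "g")}  # toy shared weights
    h, c = 0.0, 0.0
    for x in [1.0, -1.0, 1.0]:         # a toy 3-step input sequence
        h, c = lstm_step(x, h, c, W)
    ```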

    An Analysis of Word2Vec for the Italian Language

    Word representation is fundamental in NLP tasks, because it is precisely from the coding of semantic closeness between words that it is possible to think of teaching a machine to understand text. Despite the spread of word embedding concepts, achievements in linguistic contexts other than English are still few. In this work, analysing the semantic capacity of the Word2Vec algorithm, an embedding for the Italian language is produced. Parameter settings such as the number of epochs, the size of the context window and the number of negatively backpropagated samples are explored.
    Comment: Presented at the 2019 Italian Workshop on Neural Networks (WIRN'19) - June 2019
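    The semantic capacity of such an embedding is usually probed with the 3CosAdd analogy test ("a is to b as c is to ?"). The tiny hand-made vectors below are illustrative stand-ins for a trained Italian embedding, not real trained vectors.

    ```python
    import math

    # 3CosAdd analogy: return the vocabulary word closest (by cosine) to the
    # offset vector b - a + c, excluding the three query words themselves.

    def cosine(u, v):
        num = sum(a * b for a, b in zip(u, v))
        den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return num / den

    def analogy(emb, a, b, c):
        """Word w maximising cos(w, b - a + c), with a, b, c excluded."""
        target = [emb[b][k] - emb[a][k] + emb[c][k] for k in range(len(emb[a]))]
        candidates = [w for w in emb if w not in (a, b, c)]
        return max(candidates, key=lambda w: cosine(emb[w], target))

    emb = {                       # toy 2-d "royalty/gender" vectors, Italian words
        "re":      [1.0, 1.0],    # king
        "regina":  [1.0, -1.0],   # queen
        "uomo":    [0.2, 1.0],    # man
        "donna":   [0.2, -1.0],   # woman
        "mela":    [-1.0, 0.0],   # apple, a distractor
    }
    best = analogy(emb, "uomo", "donna", "re")   # donna - uomo + re ~ regina
    ```

    Accuracy on a battery of such analogies is one common way to compare the epoch, window and negative-sample settings the abstract mentions.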

    PACE: A Probabilistic Atlas for Normal Tissue Complication Estimation in Radiation Oncology

    In radiation oncology, the need for a modern Normal Tissue Complication Probability (NTCP) philosophy to include voxel-based evidence on organ radio-sensitivity (RS) has been acknowledged. Here a new formalism (Probabilistic Atlas for Complication Estimation, PACE) to predict radiation-induced morbidity (RIM) is presented. The adopted strategy consists of keeping the structure of a classical, phenomenological NTCP model, such as the Lyman-Kutcher-Burman (LKB), and replacing the dose distribution with a collection of RIM odds, also including significant non-dosimetric covariates, as input of the model framework. The theory was first demonstrated in silico on synthetic dose maps, classified according to synthetic outcomes. PACE was then applied to a clinical dataset of thoracic cancer patients classified for lung fibrosis. LKB models were trained for comparison. Overall, the obtained learning curves showed that the PACE model outperformed the LKB and predicted synthetic outcomes with an accuracy >0.8. On the real patients, PACE performance, evaluated by both discrimination and calibration, was significantly higher than LKB. This trend was confirmed by cross-validation. Furthermore, the capability to infer the spatial pattern of the underlying RS map for the analyzed RIM was successfully demonstrated, thus paving the way to new perspectives of NTCP models as learning tools.
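    The classical LKB baseline mentioned above has a closed form: a generalized equivalent uniform dose (gEUD) fed through a probit link. The sketch below uses illustrative parameter values (TD50, m, n), not the fitted values from the paper.

    ```python
    import math

    # Lyman-Kutcher-Burman NTCP model:
    #   gEUD = (sum_i v_i * D_i^(1/n))^n   over a (dose, fractional-volume) histogram
    #   NTCP = Phi((gEUD - TD50) / (m * TD50)),  Phi = standard normal CDF.

    def geud(doses, volumes, n):
        """Generalized equivalent uniform dose; n = volume-effect parameter."""
        return sum(v * d ** (1.0 / n) for d, v in zip(doses, volumes)) ** n

    def lkb_ntcp(doses, volumes, td50, m, n):
        """Complication probability via the probit link (erf form of the normal CDF)."""
        t = (geud(doses, volumes, n) - td50) / (m * td50)
        return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

    # Toy dose-volume histogram: 40% of the organ at 10 Gy, 60% at 30 Gy.
    p = lkb_ntcp(doses=[10.0, 30.0], volumes=[0.4, 0.6], td50=30.0, m=0.35, n=1.0)
    ```

    PACE keeps this overall structure but replaces the dose input with voxel-wise RIM odds, which is what lets it recover a spatial radio-sensitivity map.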

    Soils of the Aversa plain (southern Italy)

    The Aversa plain is one of the most important agricultural areas of the Campania region, combining the presence of very fertile soils, sites of great archaeological interest and growing residential urbanization. In this paper, the soil map (1:50,000 scale) of the Aversa plain is presented. Three main land systems (coastal, alluvial and foothill plains) characterized by different soil types (Andosols, Phaeozems, Cambisols, Vertisols, Arenosols, Histosols, Luvisols) have been identified. Andosols are the most widespread soil type (9768 ha) and, along with part of the Phaeozems and Cambisols, represent the most fertile soils of the Aversa plain (first and second classes of the land capability classification). To evaluate recent intense soil sealing, its impact on land capability classes over the last 60 years was assessed. Results show that soil sealing in the Aversa plain affected mainly the most fertile first- and second-class soils.

    A Unifying View of Estimation and Control Using Belief Propagation With Application to Path Planning

    The use of estimation techniques on stochastic models to solve control problems is an emerging paradigm that falls under the rubric of Active Inference (AI) and Control as Inference (CAI). In this work, we use probability propagation on factor graphs to show that various algorithms proposed in the literature can be seen as specific composition rules in a factor graph. We show how this unified approach, presented both in probability space and in log of the probability space, provides a very general framework that includes the Sum-product, the Max-product, Dynamic programming and mixed Reward/Entropy criteria-based algorithms. The framework also expands algorithmic design options that lead to new smoother or sharper policy distributions. We propose original recursions such as: a generalized Sum/Max-product algorithm, a Smooth Dynamic programming algorithm and a modified version of the Reward/Entropy algorithm. The discussion is carried out with reference to a path planning problem where the recursions that arise from various cost functions, although they may appear similar in scope, bear noticeable differences. We provide a comprehensive table of composition rules and a comparison through simulations, first on a synthetic small grid with a single goal with obstacles, and then on a grid extrapolated from a real-world scene with multiple goals and a semantic map.
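    The contrast between the Max-product (hard dynamic programming) and the smoother Sum-product-style recursion can be sketched in log-probability space. A 1-D corridor with a goal at the right end stands in for the grid world here; rewards and the temperature are illustrative, not the paper's settings.

    ```python
    import math

    # Backward value recursion on a 5-state corridor with goal at state 4.
    # smooth = 0 gives the hard max (classic DP); smooth > 0 gives a
    # log-sum-exp (soft-max) backup that yields smoother policies.

    N_STATES = 5
    STEP_REWARD = -1.0               # cost of each move
    ACTIONS = (-1, +1)               # move left or right (clamped at the walls)

    def backup(values, smooth=0.0):
        """One backward sweep over all non-goal states."""
        new = values[:]
        for s in range(N_STATES - 1):            # goal value stays fixed at 0
            q = [STEP_REWARD + values[max(0, min(N_STATES - 1, s + a))]
                 for a in ACTIONS]
            if smooth > 0.0:
                new[s] = smooth * math.log(sum(math.exp(v / smooth) for v in q))
            else:
                new[s] = max(q)
        return new

    v_hard = [0.0] * N_STATES
    v_soft = [0.0] * N_STATES
    for _ in range(20):              # enough sweeps to converge on this tiny chain
        v_hard = backup(v_hard)
        v_soft = backup(v_soft, smooth=0.5)
    ```

    The hard recursion converges to the shortest-path values (e.g. -1 one step from the goal, -4 at the far wall), while the log-sum-exp backup is always at least as large, reflecting the entropy bonus that smooths the resulting policy distribution.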

    Belief propagation and learning in convolution multi-layer factor graphs

    In modeling time series, convolution multi-layer graphs are able to capture long-term dependence at a gradually increasing scale. We present an approach to learn a layered factor graph architecture starting from a stationary latent model for each layer. Simulations of belief propagation are reported for a three-layer graph on a small data set of characters. © 2014 IEEE